Genetics Selection Evolution — Latest Matching Preprints

1

An endogenous retrovirus insertion disrupting bovine ALKBH8 causes a failure-to-thrive syndrome with immunodeficiency associated with juvenile mortality in Brown Swiss cattle

Glatthard, S.; Kadri, N. K.; Seefried, F. R.; Voitl, L. R.; Weber, B. A.; Schwarzenbacher, H.; Meister, S. L.; Gurtner, C.; OGrady, J. F.; Osbahr, M.; Leonard, A. S.; Meylan, M.; Pausch, H.; Droegemueller, C.; Jacinto, J.

2026-07-10 genomics 10.64898/2026.07.09.737535 medRxiv

Top 0.1%

11.8%

The Brown Swiss (BS) cattle breed is one of the major Swiss dairy breeds. Intensive selection and the widespread use of few elite sires in artificial insemination have increased inbreeding and the occurrence of deleterious recessive alleles in the homozygous state. Analyzing life trajectories in large, genotyped cohorts can identify hidden recessive disorders that are difficult to detect using traditional case-control association testing. Long-read DNA sequencing enables precise detection of causal alleles, including structural variants. This study aimed to (1) identify cryptic recessive loci affecting rearing performance in Swiss BS cattle, (2) evaluate their impact on survival, (3) characterize the associated phenotype, (4) identify the causal variant using long-read whole-genome sequencing, and (5) assess its functional impact. Using Homozygous Haplotype Enrichment/Depletion (HHED) mapping, we identified a risk haplotype (BH39) on chromosome 15 spanning from 16,276,819 bp to 16,446,984 bp that was associated with increased juvenile mortality within the first 180 days of life when present in the homozygous state. The BH39 occurred at a frequency of approximately 4.5% in Swiss BS cattle and 5.3% in German and Austrian BS cattle, and homozygous carriers exhibited a significantly reduced first-year survival rate. Five females homozygous for BH39 underwent clinical examination. They all showed recurrent respiratory disease, impaired growth, poor body condition, rough hair coat, and brown-discolored teeth. Pathological examination revealed bronchopneumonia and eosinophilic enteritis. Clinicopathological findings indicated failure to thrive and immunodeficiency. Long-read WGS of two BH39 homozygous calves revealed a private homozygous coding variant that was in high linkage disequilibrium with BH39. The identified structural variant was an insertion of a large transposable element (10.4 kb ERVK[2-1-LTR]) into the third exon of ALKBH8 (NM_001080341.2 c.267_268indel). 
Full-length RNA sequencing of cerebellum and liver from a homozygous calf revealed that the endogenous retrovirus (ERV) insertion introduces a cryptic transcription termination signal, truncating ALKBH8 mRNA. This study demonstrates that exploring population-scale genomic data and mining thousands of life-history records, followed by veterinary follow-up evaluations and molecular genetic analyses, provides an effective strategy for identifying cryptic recessive disorders that shorten the lifespan of cattle. The findings provide strong evidence that the ERV insertion into the coding sequence of ALKBH8 represents a loss-of-function variant that causes a previously undescribed recessive disorder that results in increased rearing loss. Interpretive summary: We identified a recessive disorder in Brown Swiss cattle that causes retarded growth, recurrent infections, immunodeficiency, and increased mortality during the first year of life. Using population-scale genomic data, clinical investigations, and long-read sequencing, we linked the disorder to an exonic transposable element insertion disrupting ALKBH8. The identification of the causal variant now enables direct genetic testing and the implementation of genome-based mating strategies to avoid carrier-by-carrier matings and, consequently, prevent the birth of affected homozygous offspring. We demonstrate the utility of integrating large-scale breeding records, veterinary phenotyping, and advanced genomics to identify hidden defects affecting livestock health and productivity.
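
As a rough illustration of why avoiding carrier-by-carrier matings matters, the reported BH39 haplotype frequencies can be converted into expected genotype proportions. The sketch below assumes Hardy-Weinberg proportions and random mating, which the abstract does not state, and the function name `hardy_weinberg` is hypothetical.

```python
def hardy_weinberg(q):
    """Return (non-carrier, carrier, homozygous) proportions for haplotype frequency q.

    Assumes Hardy-Weinberg proportions (random mating, no selection); this is an
    illustrative simplification, not the method used in the study.
    """
    p = 1.0 - q
    return p * p, 2.0 * p * q, q * q


# Haplotype frequencies as reported in the abstract (approximate values).
reported = [("Swiss BS", 0.045), ("German/Austrian BS", 0.053)]

for label, q in reported:
    _, carriers, homozygous = hardy_weinberg(q)
    print(f"{label}: carriers {carriers:.2%}, homozygous {homozygous:.3%}")

# Mendelian expectation: a carrier-by-carrier mating yields 1/4 homozygous offspring.
AFFECTED_FROM_CARRIER_MATING = 0.25
```

Under these assumptions only a small fraction of calves is expected to be homozygous, while a much larger fraction of animals carries the haplotype, which is why testing and mating plans target carriers.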

2

Genomic insights into bacterial kidney disease resistance in Arctic charr (Salvelinus alpinus) via a 72k SNP array

Palaiokostas, C.; Jeuthe, H.; Nilsson, K. N.; Hallbom, H.; Axen, C.; Evensen, O.; Eriksson, S.; Johnsson, M.

2026-06-27 genetics 10.64898/2026.06.25.734482 medRxiv

Top 0.1%

8.2%

Selection for disease resistance forms one of the most highlighted areas of aquaculture breeding. A breeding program for Arctic charr has been operating in Sweden for over 40 years, making it the oldest of its kind worldwide for this species. However, the lack of available genomic resources prevented selection for any disease-resistance traits. A 72k Axiom SNP array was produced in this study and used to assess the potential to select for charr resistant to bacterial kidney disease (BKD), which is currently a major threat to the industry. Following a challenge experiment with Renibacterium salmoninarum, the causative agent of BKD, relevant phenotypic proxies were collected from approximately 2,000 charr. Thereafter, those animals were genotyped with the new 72k SNP array. The magnitude of the estimated variance components suggested potential for breeding for BKD resistance in charr, with relevant heritabilities ranging from 0.05 to 0.56 depending on the resistance proxy used. In addition, GWAS suggested that BKD resistance is a polygenic trait. Furthermore, genomic prediction approaches indicated that BKD-resistant animals can be identified using their SNP genotypes. Accuracies, expressed as Pearson correlation coefficients, when BKD resistance was analysed as a continuous trait, ranged from 0.42 to 0.52. In the scenario where BKD resistance was treated as a binary trait, the efficiency of genomic prediction was assessed using ROC curves, with an area under the curve of 0.72. Finally, no unfavourable correlations were found with growth traits. The developed 72k SNP array has the potential of being a pivotal tool for the Swedish Arctic charr breeding program. Moreover, our data support the use of genomic prediction in breeding BKD-resistant Arctic charr. As a critical next step, further validations in actual industry conditions would be required.
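
The binary-trait result above is summarised by the area under the ROC curve. As a minimal sketch of what that number means, the AUC can be computed as the probability that a randomly chosen positive animal receives a higher predicted value than a randomly chosen negative one. The labels and scores below are hypothetical toy values, not data from the study, and which class is coded as 1 is left open.

```python
def roc_auc(labels, scores):
    """AUC as the share of positive-negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative record")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = positive class, 0 = negative class; scores are predictions.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(toy_labels, toy_scores))  # 3 of 4 pairs ordered correctly -> 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported 0.72 sits between the two.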

3

Analysis of genetic variation in the bovine Mannose Receptor gene (MRC1), its influence on receptor expression, and a potential association with resistance to bovine tuberculosis

Holder, A.; Kolakowski, J. F.; Usher, E.; Tzelos, T.; Connelley, T. K.; Shabbir, M. Z.; Gibson, A. J.; Harris, H.; Villarreal-Ramos, B.; Werling, D.

2026-07-03 immunology 10.64898/2026.06.27.734952 medRxiv

Top 0.1%

3.2%

Naturally occurring variation in the bovine mannose receptor C-type 1 gene (MRC1) may shape macrophage responses to Mycobacterium (M.) bovis, a key driver of bovine tuberculosis (bTB). We identified four coding region SNPs in MRC1 across Bos taurus (Holstein Friesian, Brown Swiss) and Bos indicus (Boran, Sahiwal) cattle breeds, including a non-synonymous variant, rs380943118 (c.2963G>A; Ser988Asn) in C-type lectin-like domain (CTLD) 6, most prevalent in Sahiwal cattle. Structural modelling suggested that the S988N substitution, which is spatially separated from the monosaccharide binding site of CTLD4, might indirectly affect glycan binding, perhaps through a conformational change in the receptor. Monocyte-derived macrophages upregulated MR expression during differentiation, with heterozygous (G/A) animals showing higher MR expression and increased uptake of GFP-M. bovis BCG, although differences were not statistically significant. Anti-CD206 blockade did not inhibit BCG internalization, either indicating that this specific antibody did not bind to a CTLD involved in ligand binding or that MR is not the sole entry receptor. These results highlight naturally occurring MRC1 polymorphisms that may influence MR structure and macrophage function, providing a foundation for future studies to assess their role in bTB susceptibility.

4

Genome-wide meQTL mapping in cattle blood reveals cis and trans regulation of DNA methylation

Fouere, C.; Costes, V.; Besnard, F.; Le Danvic, C.; Patry, C.; Fritz, S.; Boussaha, M.; Jouin, M.; Boichard, D.; Kiefer, H.; Costa Monteiro Moreira, G.; Sanchez, M.-P.

2026-07-08 genetics 10.64898/2026.07.07.736355 medRxiv

Top 0.1%

3.1%

Background: Complex traits are influenced by numerous variants, most of which have regulatory effects on gene expression that can be mediated by DNA methylation. Molecular QTL mapping is an approach that aims to dissect these effects. However, obtaining molecular phenotypes on a large scale is challenging, particularly in livestock species. In cattle, an epigenotyping array called EpiChip has recently been developed in the European RUMIGEN project. The EpiChip, which contains 43,317 CpG sites distributed all over the bovine genome, enables large-scale measurement of DNA methylation. This study aims to characterize the genetic determinism of blood DNA methylation in cows by estimating heritability and mapping cis- and trans-methylation QTLs (meQTLs). Results: Whole blood samples from 4,457 genotyped Holstein cows were epigenotyped. Across all CpG sites, the heritability estimates averaged 24.6%. The local meQTL mapping at sequence level for variable CpG sites (SD > 2.5%; n = 28,806) detected cis-meQTLs for 80.1% of the CpG sites, with sentinel SNPs located close to their associated CpGs. A two-step analysis was also conducted to identify long-range associations, with a particular focus on trans-meQTL hotspots. First, we identified CpG-SNP trans-associations using medium-density genotypes (50k SNPs) that revealed 31,846 SNPs with significant effects on 1 to 530 trans-CpG sites. Then, regions associated with at least 34 independent trans-CpGs were retained, defining 31 hotspots. For each hotspot, a local sequence-level GWAS was conducted using the first principal component derived from the associated trans-CpGs. Out of the 31 detected hotspots, three were located close to transcription factor genes (RUNX1, NFIC and FOXA3) for which the associated trans-CpGs were enriched for the corresponding binding motif.
Two other hotspots were located within KDM5A and KDM5B, and their corresponding trans-CpGs were strongly overrepresented in H3K4me3 narrow peaks in blood as well as in other tissues. Conclusions: By identifying functional candidate genes associated with blood DNA methylation in cattle, these findings provide new insights into the regulatory architecture of DNA methylation in mammals, highlighting the value of large-scale molecular data from livestock populations.

5

Nitrogen use efficiency in pigs is associated with transcriptomic signatures related to amino acid metabolism, immune activity, and nutrient partitioning

Monney, B.; Ewaoluwagbemiga, E. O.; Kasper, C.

2026-07-01 genomics 10.64898/2026.06.26.733976 medRxiv

Top 0.1%

1.9%

Dietary protein restriction challenges the allocation of amino acids to growth and other physiological functions and therefore requires coordinated metabolic adaptation. Domestic pigs provide an informative system in which to study such responses, because nitrogen retention directly affects lean growth and can be quantified accurately under controlled feeding and housing conditions. Under reduced-protein diets, pigs differ in how effectively they retain nitrogen, and this variation has a genetic basis, making them well suited to investigate the molecular regulation of nitrogen use efficiency (NUE). Here, we characterise differential gene expression and enriched pathways in liver and skeletal muscle of more than 80 pigs with two divergent NUE phenotypes (high and low) maintained under the same protein-reduced, ad libitum dietary conditions. The two NUE phenotypes were clearly distinct at the transcriptomic level, with 177 differentially expressed genes in the liver and 133 in the muscle. In the liver, differential expression and enrichment analyses indicate reduced amino acid catabolism, lower inflammatory and detoxification activity, and a metabolic state that favours lipid processing and insulin-related regulation over the use of amino acids as energy sources. In skeletal muscle, they point to reduced lipid uptake, lower reliance on amino acid oxidation, and a greater emphasis on protein synthesis, translational regulation, mitochondrial energy metabolism, and growth-related processes. These gene-level patterns were supported and extended by pathway and gene-set enrichment analyses. Together, the results suggest that high and low-NUE pigs differ through coordinated, tissue-specific molecular adaptations. Overall, variation in NUE appears to reflect coordinated, tissue-specific differences in how nutrients are allocated between energy use, storage, and lean tissue growth.

6

Variation in AMY2B Copy Number and Serum Amylase Activity in Wolves (Canis Lupus), Brown Bears (Ursus arctos), and Red Foxes (Vulpes vulpes) from Bosnia and Herzegovina

Katica, J.; Crnkic, C.; Kavazovic, A.; Tahirovic, D.; Pojskic, N.; Skapur, V.; Koro-Spahic, A.; Varatanovic, M.; Goletic, T.

2026-07-14 genetics 10.64898/2026.07.09.737415 medRxiv

Top 0.1%

1.5%

The AMY2B gene encodes pancreatic amylase, a critical enzyme for starch digestion. While previous studies have examined AMY2B copy number variation (CNV) in domestic and some wild animals, less is known about wild carnivores inhabiting regions with limited anthropogenic starch exposure. We analyzed blood samples for serum amylase activity and copy number variation in AMY2B gene from 8 wolves (Canis lupus), 11 brown bears (Ursus arctos), and 3 red foxes (Vulpes vulpes) from Bosnia and Herzegovina. AMY2B gene copy number was assessed using droplet digital PCR (ddPCR), and serum amylase activity and glucose levels were quantified. Although the number of fox samples was limited, foxes and wolves consistently harbored two copies of AMY2B, while brown bears exhibited higher CNV (3.67-8.40, mean 5.88). Serum amylase activity was highest in foxes, moderate in wolves, and variable but lower in bears. Despite differences in AMY2B copy number and serum amylase activity, circulating glucose concentrations did not differ significantly among species. Our findings suggest that variation in AMY2B copy number among wild carnivores may be associated with species-specific evolutionary histories and dietary adaptations, providing insight into genomic mechanisms underlying carbohydrate utilization in natural populations.
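
Copy number from ddPCR is commonly derived from the ratio of target to reference concentrations, which is also why estimates can be non-integer. The sketch below assumes a reference assay present at two copies per diploid genome. The abstract gives no assay details, so the function name and the concentrations are hypothetical.

```python
def copy_number(target_conc, reference_conc, reference_copies=2):
    """Estimate target copies per diploid genome from ddPCR concentrations.

    target_conc and reference_conc are in the same units (e.g. copies/uL).
    reference_copies is the assumed copy number of the reference locus.
    """
    if reference_conc <= 0:
        raise ValueError("reference concentration must be positive")
    return reference_copies * target_conc / reference_conc


# Hypothetical readings, not values from the study.
print(copy_number(target_conc=300.0, reference_conc=100.0))  # 6.0
print(copy_number(target_conc=100.0, reference_conc=100.0))  # 2.0
```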

7

Two-tower models for genomic prediction of reproductive outcomes and sex-specific fertility liabilities: simulation insights

Pappas, F.; Palaiokostas, C.; Debes, P. V.; Johnsson, M.

2026-07-09 genetics 10.64898/2026.07.03.736358 medRxiv

Top 0.1%

1.4%

Many biological characteristics arise by interactions between more than one biological organism or unit. Fertilization success in sexually reproducing species represents such an extended phenotype where both mates are required to be fertile for a successful outcome. Consequently, predictive models should account for the joint nature of reproductive performance while offering interpretable estimates for individual mate contributions. Recent advances in genomics and machine learning (ML) provide standardized, high-dimensional genetic information on one hand and computational tools capable of modeling complex biological systems on the other. Here, we construct and evaluate two-tower (TT) machine learning architectures for genomic prediction of binary reproductive outcomes and recovery of sex-specific fertility liabilities. Simulated datasets, generated under a range of genetic architectures, were utilized to compare multilayer perceptron (TT-MLP), convolutional neural network (TT-CNN), and L1-regularized linear (TT-LASSO) two-tower models. Simulation scenarios varied sex-specific heritabilities, genetic correlations, infertility prevalence, mating structure, and sex-specific infertility rates. Models were evaluated with regard to their ability to predict reproductive success at pair level and also recover true underlying genetic values for male and female fertility. Prediction accuracy increased with the underlying heritable component as expected, while sex-specific tower-scores successfully recovered latent fertility liabilities despite models being trained only on observed joint outcomes. TT-LASSO achieved the highest overall classification performance, whereas TT-MLP provided more balanced and consistent recovery of sex-specific genetic values across scenarios. An additional simulation, incorporating genotype-dependent mate compatibility demonstrated advantages of fully-connected neural networks for capturing non-additive interactions. 
These results indicate that two-tower frameworks provide a powerful approach for modeling reproductive traits, enabling simultaneous prediction of aggregate reproductive outcomes and sex-specific fertility liabilities from genotypic information.
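
To make the two-tower idea concrete, here is a minimal sketch of a linear variant: one tower scores the male genotype, the other scores the female genotype, and the pair outcome is predicted from both scores. Combining the tower scores by summation before a logistic link is an assumption made for illustration, since the abstract does not say how the towers are joined. All names and weights are hypothetical, and no training or L1 penalty is shown.

```python
import math


def tower_score(weights, genotype):
    """Linear tower: weighted sum of allele dosages (0, 1 or 2 per marker)."""
    return sum(w * g for w, g in zip(weights, genotype))


def predict_success(w_male, w_female, bias, male_geno, female_geno):
    """Return (P(success), male score, female score) for one mating pair.

    The per-sex scores play the role of the sex-specific liabilities; only the
    joint outcome would be observed when fitting such a model.
    """
    s_m = tower_score(w_male, male_geno)
    s_f = tower_score(w_female, female_geno)
    logit = s_m + s_f + bias  # assumed additive combination of the two towers
    return 1.0 / (1.0 + math.exp(-logit)), s_m, s_f


# Hypothetical weights and genotypes for three markers per sex.
prob, male_score, female_score = predict_success(
    w_male=[0.5, -0.2, 0.0],
    w_female=[0.1, 0.3, -0.4],
    bias=0.0,
    male_geno=[2, 1, 0],
    female_geno=[0, 2, 1],
)
print(prob, male_score, female_score)
```

Because each tower sees only one mate's genotype, its output can be read as that individual's contribution, which is what allows sex-specific values to be inspected separately from the pair-level prediction.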

8

Enhancing predictive accuracy of yield traits in cassava through multi-trait genomic prediction

de Freitas, G. M.; Certuche, D. S.; Jannink, J.-L.; de Oliveira, E. J.; Garcia, A. A. F.

2026-07-06 genetics 10.64898/2026.07.01.735838 medRxiv

Top 0.2%

1.1%

Multi-trait genomic prediction offers a practical route to improve selection for costly, complex traits in clonally propagated crops such as cassava. In a Brazilian breeding panel of 1,078 cassava clones genotyped with 25,923 SNPs and phenotyped for six agronomic traits, we compared single-trait (ST) and multi-trait (MT) GBLUP models. Stage-wise mixed models produced BLUEs that fed into ST and MT-GBLUP. We tested five cross-validation schemes that mimic breeder realities: ST baseline (CV1); naive all-traits MT prediction for unphenotyped candidates (CV2); MT prediction using auxiliary trait phenotypes in the test set (CV3); and two sparse-phenotyping regimes with missingness by trait (CV4) or by clone (CV5) at 25%, 50%, and 75% levels. The main results were that, under the ST baseline (CV1), predictive ability ranged from 0.50 for DMC and 0.45 for FRY down to 0.13 for Le.Dis. A naive full MT model (CV2) performed approximately on par with ST-GBLUP. In contrast, MT designs (CV3) that included informative auxiliary traits, such as shoot yield and combinations with plant vigor and leaf disease severity, yielded small gains for DMC with predictive ability of approximately 0.51 (+2%), while FRY predictive ability increased to approximately 0.65 (+44%), accompanied by RMSE reductions for FRY up to approximately 13.5% (e.g. RMSE approximately 6.2). Sparse-phenotyping simulations (CV4/CV5) demonstrated that MT models sustain or even improve predictive ability under realistic missing-data regimes (PA ≈ 0.62-0.65). Selection concordance between MT and ST top-10% sets was generally high (>0.80), and MT configurations produced measurable improvements in expected selection response and genetic gain per cycle for several target traits.
These results indicate that strategically implemented MT-GBLUP, using a small set of biologically and operationally informative auxiliary traits and optimized sparse phenotyping, can materially increase predictive accuracy and selection efficiency for economically critical cassava traits while reducing phenotyping burden.

9

A gapless Landrace pig genome resolves centromeres and telomeres and highlights telomere repeat structures in different pig breeds

Grove, H.; Stenlokk, K. S. R.; Lien, S.; Gjuvsland, A. B.; Arnyasi, M.; van Son, M.; Kent, M.

2026-06-30 genomics 10.64898/2026.06.25.734473 medRxiv

Top 0.2%

1.0%

The Duroc-derived reference genome Sscrofa11.1 has provided a critical foundation for pig genomics, supporting accurate variant detection and comparative genomics, but it does not capture breed-specific variation. Here, we present a near-complete, gap-free genome assembly for the Landrace pig (Landrace_v1, GCA_963921485.1), spanning all 20 chromosomes and totaling 2.6 Gb, including 176 Mb of sequence absent from Sscrofa11.1. Comparative analyses with recently published high-quality pig genomes reveal a conserved centromere organization across breeds, accompanied by substantial variation in repeat composition and length, and identify a pig-specific pattern of telomere variant repeats across eight pig breeds. The improved resolution of repetitive regions in Landrace_v1 enables more complete reconstruction of complex gene families, including olfactory receptors, and uncovers structural variation at the KIT proto-oncogene receptor tyrosine kinase locus not represented in the Duroc reference. Together, these findings highlight the limitations of single-reference genomes and demonstrate the value of breed-specific assemblies for capturing genomic diversity and improving downstream analyses.

10

Comparison of localGEBV and Optimal Haplotype Stacking Fitness Functions using a Novel R Package: HapSelect

Shaffer, W.; Papin, V.; Carter, Z.; Brunner, S. M.; Tong, J.; Villiers, K.; Robinson, H.; Voss-Fels, K.; Hayes, B. J.; Hickey, L.; Dinglasan, E.

2026-07-13 genetics 10.64898/2026.07.08.737160 medRxiv

Top 0.2%

0.6%

Haplotype-based breeding strategies have emerged as promising approaches to maximize long-term genetic gain by identifying complementary parental combinations while maintaining genetic diversity. However, these methods typically require phased genotypes and more intensive workflow pipelines and skillsets. We developed a novel local genomic estimated breeding value (localGEBV) fitness function with similar intent to the optimal haplotype stacking (OHS) framework fitness function and implemented both in the novel R package, HapSelect. Our aim was to evaluate whether phased haplotypes provide additional benefit over the more easily available dosage-based unphased genotypes in highly inbred crops. A subset of a bread wheat nested association mapping (NAM) population comprising 444 lines genotyped with 6,054 DArT-Seq markers was analysed. Marker effects were estimated using rrBLUP, localGEBV and haplotype effects were calculated across linkage disequilibrium-defined haploblocks, and genetic algorithms (GA) were used to identify optimal sets of 30 founders using either a localGEBV-derived fitness function with unphased dosage inputs or the OHS fitness function with phased inputs. Selected parental sets were compared with conventional truncation selection (TS) through 150 generations of forward simulation. The OHS fitness function achieved a marginally greater optimized ultimate GEBV than the localGEBV fitness function during GA optimization, with only 18 of the 30 selected founders overlapping between the two methods. Despite these differences, forward simulations demonstrated nearly identical long-term genetic gain for localGEBV and OHS-selected founders, with both approaches outperforming conventional truncation selection by maintaining greater genetic diversity and delaying the genetic plateau. The minimal difference between localGEBV and OHS is likely attributable to the high homozygosity of the population, where localGEBV and haplotype effects are nearly confounded.
These results demonstrate that dosage-based localGEBV provides a practical alternative to phased haplotype approaches for parent selection in inbred crops, substantially simplifying genomic workflows while maintaining long-term breeding performance. Future work should evaluate these methods in more diverse inbred populations and outbred species, where greater haplotypic diversity may increase the advantage of true haplotype-based optimizations.

11

An axiomatic approach to cultivar ranking in multi-environment trials

Kondratev, A. Y.; Ianovski, E.; Voronina, E.; Crossa, J.

2026-07-01 genetics 10.64898/2026.06.27.734959 medRxiv

Top 0.3%

0.5%

Multi-environment trials are central to cultivar evaluation because they reveal how candidate cultivars perform across locations, years, management conditions, and stress environments. The resulting yield matrix is a rich source of data on genotype-by-environment interaction, and a wide literature on estimation, decomposition, visualisation, and prediction of yield potential and stability has flourished. However, the ultimate question of which cultivar to recommend on the basis of such a matrix is often left implicit. The question is far from trivial, and in this paper we formulate cultivar recommendation as an axiomatic ranking problem. This framework is rich enough to encompass the existing literature on stability indices, as well as any other deterministic ranking procedure. We show that many commonly used stability-based procedures can violate minimal criteria of efficiency or consistency. The result of such violations is that a cultivar with uniformly high yield could be ranked below a cultivar with uniformly low yield, or the relative ranks of two cultivars could depend on whether or not a third cultivar is present in the matrix. Our results prove that under a small number of such criteria the space of admissible rules collapses to the family of power means and their limiting cases. If we further wish to allow multiplicative normalisation of yield, we are left with the geometric mean as the unique solution.
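
The role of the geometric mean can be seen on a toy yield matrix. The sketch below ranks cultivars by a power mean across environments and then rescales one environment by a positive constant. Reading "normalisation" as per-environment rescaling is an assumption about the paper's setting, and the cultivar names, yields and function names are hypothetical.

```python
import math


def power_mean(yields, p):
    """Power mean of strictly positive yields; p = 0 is the geometric-mean limit."""
    n = len(yields)
    if p == 0:
        return math.exp(sum(math.log(y) for y in yields) / n)
    return (sum(y ** p for y in yields) / n) ** (1.0 / p)


def rank(matrix, p):
    """Cultivar names sorted best-first by power mean across environments."""
    return sorted(matrix, key=lambda name: power_mean(matrix[name], p), reverse=True)


# Hypothetical yields in two environments.
raw = {"A": [1.0, 8.0], "B": [3.0, 4.0]}
# Same trial after dividing environment 2 by its maximum (8.0).
scaled = {name: [y[0], y[1] / 8.0] for name, y in raw.items()}

print(rank(raw, 1), rank(scaled, 1))  # arithmetic mean: the order flips
print(rank(raw, 0), rank(scaled, 0))  # geometric mean: the order is unchanged
```

Rescaling an environment multiplies every cultivar's geometric mean by the same factor, so the ordering cannot change, whereas the arithmetic mean lets the environment with the larger numbers dominate.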

12

Modeling population control via tunable sex ratio distorter gene drives in Aedes aegypti

Childs, L. M.; Shabani, S.; Tauber, U.; Tu, Z.

2026-07-09 genetics 10.64898/2026.07.05.736587 medRxiv

Top 0.4%

0.3%

Aedes aegypti is a major vector of arboviruses and belongs to the subfamily Culicinae, a diverse group of mosquitoes with homomorphic sex-determining chromosomes. Males are the heterogametic sex with a dominant male-determining locus (M locus). The M locus and its counterpart m locus are embedded in a region of suppressed recombination, with a large portion of this recombination desert showing significant molecular differentiation despite homomorphy. We developed a mathematical framework to examine M-linked genome editors that specifically target the m-chromosome during spermatogenesis, mimicking the naturally occurring sex ratio distorters (SRDs) in Culicinae that produce male-biased meiotic drives. Unlike previous models for species with heteromorphic sex chromosomes (e.g., X and Y), we incorporate features stemming from the homomorphic nature of the Ae. aegypti sex chromosomes, such as varied linkage to the M locus, making the degree of super-Mendelian inheritance readily tunable. We evaluated in silico SRDs with a range of M-linkage and editing efficiencies and established the theoretical foundation for developing highly efficient SRDs that outperform several methods of population suppression. These SRDs can be tuned to mitigate impact on a neighboring population. The framework developed here is suitable for exploring SRD-mediated genetic biocontrol of pests with homomorphic sex chromosomes.

13

A Draft Male Genome Assembly of the Slipper Lobster (Thenus australiensis) Reveals an XY System and a Validated Diagnostic Marker for Monosex Aquaculture.

Tran Nguyen, A. H.; Ha, G.-H.; Tran, D.-P.; Le, N. T.; Glendining, S.; Fitzgibbon, Q.; Herzig, V.; Luu, P.-L.; Ventura, T.

2026-06-29 genomics 10.64898/2026.06.24.734161 medRxiv

Top 0.4%

0.3%

The slipper lobster (Thenus australiensis) is rapidly emerging as a high-potential species for commercial aquaculture. Because females exhibit superior growth characteristics due to less frequent moulting after sexual maturity, developing monosex breeding strategies is highly desirable for industry profitability. However, the lack of genomic resources and early sex-identification tools has hindered this development. Here, we report the first draft male genome assembly for T. australiensis, generated using a combination of whole-genome shotgun sequencing, DArT-seq, and multi-tissue transcriptomics. The curated assembly spans 0.913 Gbp with high functional completeness (93.0% BUSCO), providing a robust repertoire of 30,100 protein-coding genes. Through k-mer subtraction and population-level DArT-seq genotyping, we provide definitive evidence that T. australiensis utilizes an XX/XY sex-determination system. Crucially, by identifying male-specific structural variations within a neo-Y locus, we developed a diagnostic PCR assay targeting a male-exclusive sequence. This 171 bp marker achieved 100% accuracy in phenotypic sex identification across wild-caught populations. Ultimately, these foundational genomic resources, combined with a highly reliable molecular sexing tool, provide the critical framework necessary for early sex sorting, broodstock management, and the commercial advancement of monosex slipper lobster farming.

14

Nemo2.4: fast and accurate quantitative genetics forward-time simulations

Guillaume, F.; Cotto, O.; Chebib, J.; Beeravolu Reddy, C.; Schmid, M.

2026-07-08 evolutionary biology 10.64898/2026.07.02.736177 medRxiv

Top 0.5%

0.2%

We present Nemo 2.4, an advanced forward-time individual-based simulation framework designed to model the complex eco-evolutionary dynamics and genetic basis of quantitative traits. This tool addresses current challenges in evolutionary quantitative genetics by providing unprecedented flexibility and computational efficiency. Nemo 2.4's modular architecture allows researchers to design custom life cycles by combining specialized Life Cycle Event (LCE) modules, from reproduction and dispersal to selection, crossing, and phenotype expression. The software supports diverse population models, including both Wright-Fisher (WF) and non-WF dynamics, spatially explicit models, and varying demography. Nemo 2.4 handles a wide range of genetic architectures, including both multi-allelic Quantitative Trait Loci (QTL) for general trait studies, and dense di-allelic Quantitative Trait Nucleotides (QTN) implemented with highly optimized bit-wise data structures. Crucially, it allows the simulation of QTNs on comprehensive genetic maps that incorporate other genetic elements, providing genomic-scale resolution. Key biological complexities are integrated natively: the model accommodates modular pleiotropy, dominance, and pairwise epistasis across multiple traits, facilitating the study of complex genotype-phenotype mappings. Furthermore, Nemo 2.4 models phenotypic plasticity through reaction norms and incorporates underlying liability thresholds, enabling the simulation of environmental influences on trait evolution with various forms of selection (e.g., Gaussian, linear, truncation). Due to its compiled design and memory-efficient data representations for large numbers of loci, Nemo provides a robust platform for running high-throughput simulations critical for testing theoretical predictions in polygenic adaptation and understanding evolutionary responses to changing environments.

15

Can exercise training improve mitochondrial thermal responses in rainbow trout cardiomyocytes?

Prescott, L.; Le, T.; Seppanen, E.; Henttinen, T.; Anttila, K.

2026-07-01 physiology 10.64898/2026.06.26.734741 medRxiv

Top 0.5%

0.2%

Climate-driven warming is challenging the physiological limits of aquatic ectotherms, with cardiac performance emerging as one of the key determinants of thermal tolerance. Cardiac function relies on mitochondrial ATP production, and mitochondrial dysfunction has been linked to cardiac failure at critical temperatures. However, mitochondria are plastic and may represent a target for interventions aimed at improving thermal tolerance in fish. Exercise training improves whole-animal performance in fish, including cardiac thermal performance, and improves mitochondrial function in other taxa. However, its effects on the thermal sensitivity of cardiac mitochondria remain unknown. This study investigated whether exercise training alters cardiac mitochondrial performance at optimal and critical temperatures in rainbow trout Oncorhynchus mykiss. Farmed rainbow trout were subjected to a four-week exercise training regime, while control fish remained under standard rearing conditions. Cardiac mitochondrial respiration was assessed in permeabilised heart fibres at 16 °C (optimal growth temperature) and 26 °C (temperature associated with cardiac arrhythmia), and several biochemical and nuclear indicators were measured. No significant differences were detected between treatments for any measured variable. However, trained fish generally exhibited higher maximal respiratory capacities and respiratory control ratios, particularly at the elevated temperature, suggesting subtle improvements in mitochondrial function despite considerable inter-individual variation. Temperature influenced mitochondrial performance, increasing proton leak and reducing coupling efficiency. These findings demonstrate that cardiac mitochondrial function is thermally sensitive and represents a potential target for improving thermal resilience in aquaculture species.

16

A cryptic local genetic cluster in Northern France amid the European mosaic of flat oyster lineages revealed by integrating SNP array and whole-genome sequencing

Lapegue, S.; Cornette, F.; Heurtebise, S.; Pouvreau, S.; Carpentier, C.; Colston-Nepali, L.; Bierne, N.; Reisser, C.

2026-06-28 genetics 10.64898/2026.06.26.734753 medRxiv

Top 0.6%

0.1%

The European flat oyster (Ostrea edulis), like numerous other oyster species, has been exploited for millennia and cultivated and translocated for centuries. Following a severe population decline, and in the context of ongoing conservation and restoration programs, genetic considerations must now be addressed to avoid mistakes. The objective of our study was to complement population genetic studies conducted at various scales along European coasts. Our sampling primarily targeted the French Atlantic, English Channel, and Mediterranean coasts, aiming to provide a fine-scale genetic characterization of populations in these regions. By integrating SNP array and low-coverage sequencing datasets, we obtained a comprehensive overview of the population genetic structure of Ostrea edulis across western Europe. Most previously identified clusters in Western Europe were confirmed. In France, populations assigned to these clusters exhibited notable within-patch homogeneity. However, two key findings emerged: (1) an extensive overlap zone between the Atlantic and western Mediterranean clusters, spanning at least from southern Portugal to southern France, and (2) the detection of a novel, clearly distinct cryptic cluster east of the English Channel, whose geographic range remains to be better delineated. These insights are critical for informing management decisions, particularly as restoration and conservation plans are currently being implemented across the species range.

17

Linking geography and mutation profiles across goat species

Bionda, A.; Crepaldi, P.; Prendergast, J. G. D.; Neupane, M.; Amills, M.; Rosen, B. D.; Tosser-Klopp, G.; Milanesi, M.; Talenti, A.; The VarGoats Consortium

2026-07-14 genomics 10.64898/2026.07.09.737241 medRxiv

Top 0.6%

0.1%

Recent studies have characterised the mutational profile across multiple mammalian species, highlighting substantial differences across lineages. However, none of these studies investigated whether mutation profiles and geography are significantly correlated. In this study, we present a multi-genome alignment spanning several Capra taxa, reconstruct the ancestral genome of Capra hircus and use it to characterize the mutational profiles across multiple Capra species by using the 1000 genomes VarGoats dataset. Results confirmed that the scale of differences among Capra species largely reflects their phylogenetic relationships, in particular with the Bezoar being genetically closer to domestic goats than to other wild species. Subsequently, we correlated the mutational profile and the geographical origin of the different individuals. In particular, ACG>ATG changes have the strongest correlation with longitude (r = -0.79, P-value = 3.02 × 10^-204), while TCA>TGA changes are strongly correlated with latitude (r = -0.51, P-value = 4.30 × 10^-63). We highlight how sequential dinucleotide mutations (SDMs) place cosmopolitan breeds closer to the sampling location, rather than the country of origin, showing how the recent relocation of cosmopolitan breeds to new continents is reshaping the genome of these animals. Finally, we used the mutational profile to predict the coordinates of origin of each animal in the dataset. In conclusion, we show the important role that geography had in shaping the genomes of domestic goats.
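As a rough illustration of the kind of correlation reported above, the sketch below computes a per-individual fraction for one mutation type and its Pearson correlation with longitude. The function names, the data layout, and all numbers are hypothetical and invented for illustration; this is not the authors' pipeline, and it does not reproduce their r or P-values.

```python
import math


def mutation_fraction(counts, mutation_type):
    """Fraction of an individual's mutations that belong to one mutation type."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("individual has no counted mutations")
    return counts.get(mutation_type, 0) / total


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Toy, invented records: longitude in degrees and mutation counts per type.
individuals = [
    {"longitude": -5.0, "counts": {"ACG>ATG": 30, "other": 970}},
    {"longitude": 10.0, "counts": {"ACG>ATG": 27, "other": 973}},
    {"longitude": 45.0, "counts": {"ACG>ATG": 22, "other": 978}},
    {"longitude": 80.0, "counts": {"ACG>ATG": 18, "other": 982}},
]

longitudes = [ind["longitude"] for ind in individuals]
fractions = [mutation_fraction(ind["counts"], "ACG>ATG") for ind in individuals]
r = pearson_r(longitudes, fractions)
print(f"toy Pearson r = {r:.3f}")
```

A negative r in this toy setting means the mutation type becomes rarer towards the east, which is the direction of the longitude association described in the abstract.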

18

GenoSim: A Forward-Time Genotype Simulator for Clinical and Population Genetics with Population Stratification

Bakar, A.; Gul, R.; Haq, W. u.; Afghani, T.

2026-06-25 bioinformatics 10.64898/2026.06.20.733503 medRxiv

Top 0.6%

0.1%

Motivation: Next-generation sequencing studies in clinical genetics are often limited by the scarcity of human genotype data, which stems from ethical, regulatory, and economic barriers. The shortfall is sharpest in consanguineous populations, which are common in South Asia and the Middle East, where family-based designs need large pedigrees that are rarely sequenced in full. Existing simulators do not combine pedigree-aware propagation, realistic population stratification, and clinical export formats in one tool. Results: We present GenoSim, an R package for forward-time simulation of diploid SNP genotypes. It runs in two modes: a population mode implementing inbreeding-adjusted Hardy-Weinberg sampling, Wright-Fisher drift, directional selection, recurrent mutation, and Haldane recombination across multiple generations; and a pedigree-constrained mode that ingests real family VCFs and a pedigree, reconstructs phase where the pedigree makes it identifiable, propagates genotypes through the observed family structure, and appends synthetic generations. Version 1.1.1 adds population stratification through the Balding-Nichols model parameterised by gnomAD v3.1 fixation indices (F_ST) for eight ancestry groups (AFR, AMR, EAS, EUR, FIN, MID, SAS, ASJ), empirical allele-frequency loading from external reference panels, and admixed-cohort simulation. Analysis functions cover Hardy-Weinberg testing, linkage disequilibrium, runs of homozygosity, principal component analysis, founder-referenced and between-generation F-statistics, and Nei gene diversity. Availability and implementation: GenoSim is available as an R package at https://github.com/malikbak/GenoSim under the MIT licence. It requires R ≥ 4.0.0 and depends only on base R packages (stats, utils, graphics, grDevices, tools).
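Three of the building blocks named in this abstract have standard textbook forms, sketched below in Python for illustration. This is not GenoSim's code or API (GenoSim is an R package); every function name and parameter value here is a hypothetical choice. The sketch assumes the usual definitions: the Balding-Nichols model draws a subpopulation allele frequency from Beta(p(1-F)/F, (1-p)(1-F)/F); inbreeding-adjusted Hardy-Weinberg genotype probabilities are q² + fpq, 2pq(1-f) and p² + fpq; and one Wright-Fisher generation is binomial resampling of 2N allele copies.

```python
import random


def balding_nichols_freq(p_ancestral, fst, rng):
    """Draw a subpopulation allele frequency under the Balding-Nichols model.

    Requires 0 < p_ancestral < 1 and 0 < fst < 1.
    """
    scale = (1.0 - fst) / fst
    return rng.betavariate(p_ancestral * scale, (1.0 - p_ancestral) * scale)


def genotype_probs(p, f_inbreeding):
    """Probabilities of carrying 0, 1 or 2 alternate alleles with inbreeding f."""
    q = 1.0 - p
    hom_ref = q * q + f_inbreeding * p * q
    het = 2.0 * p * q * (1.0 - f_inbreeding)
    hom_alt = p * p + f_inbreeding * p * q
    return [hom_ref, het, hom_alt]


def sample_genotypes(p, f_inbreeding, n, rng):
    """Sample n diploid genotypes coded as alternate-allele counts (0, 1, 2)."""
    return rng.choices([0, 1, 2], weights=genotype_probs(p, f_inbreeding), k=n)


def wright_fisher_step(p, n_diploid, rng):
    """One generation of drift: binomial resampling of 2N allele copies."""
    copies = sum(1 for _ in range(2 * n_diploid) if rng.random() < p)
    return copies / (2 * n_diploid)


# Hypothetical parameter values, chosen only to exercise the functions.
rng = random.Random(42)
p_sub = balding_nichols_freq(p_ancestral=0.2, fst=0.05, rng=rng)
genotypes = sample_genotypes(p_sub, f_inbreeding=0.0625, n=10, rng=rng)
p_next = wright_fisher_step(p_sub, n_diploid=500, rng=rng)
print(p_sub, genotypes, p_next)
```

Setting f_inbreeding to 0 recovers plain Hardy-Weinberg proportions, and larger fst values spread the subpopulation frequencies further from the ancestral frequency.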

19

Recommendations for the ethical and accurate use of population descriptors: a trainee-led survey of early-career researchers

Sharma, J.; Maldonado, B.; Ungar, R. A.; Adimoelja, A.; Flores, J.; Gjorgjieva, T.; Jones, K.; Khan, A.; Xue, D.; Patel, R.; Caggiano, C.

2026-07-07 genetics 10.64898/2026.07.01.735829 medRxiv

Top 0.6%

0.1%

Despite the importance of population descriptors in human genomics research, many scientists struggle to translate evolving ethical guidelines into their computational workflows. To characterize this gap between recommendations and implementation, we conducted a mixed-methods survey of early-career researchers to assess how they understand and implement the landmark 2023 NASEM report on the use of population descriptors in human genetics research. We show that while exposure to the report fosters ethical awareness, fundamental misconceptions about race and ancestry persist across academic disciplines, and trainees face structural bottlenecks, including legacy data constraints and a lack of technical confidence. To address this gap, we offer actionable, stakeholder-specific recommendations across the research lifecycle ranging from decision-support tools to "bring-your-own-data" workshops to leadership from academic journals, scientific societies, and trainee mentors. Ultimately, we argue that to promote scientific rigor and reduce bias in genetic discoveries, the scientific ecosystem must invest in the infrastructure necessary to empower the next generation of researchers.

20

Repair outcomes after germline homing endonuclease cleavage in Anopheles gambiae inform the design of synthetic gene drives

Naujoks, D.; Nolan, T.

2026-06-23 genetics 10.64898/2026.06.23.733901 medRxiv

Top 0.7%

0.1%

Homing endonuclease genes spread by cleaving homologous chromosomes that lack the endonuclease cassette, after which repair from the endonuclease-containing chromosome converts the cut allele into a copy of the drive allele. This mechanism has provided a conceptual foundation for synthetic gene drive systems, including CRISPR-based drives, that represent promising strategies for the genetic control of insect pests. However, gene drive performance depends critically on the repair pathways available in the germline of the target organism. Here, we report a set of transgenic assays originally developed as part of an attempt to establish gene targeting in the malaria mosquito Anopheles gambiae using an in vivo-generated linear targeting molecule. Although the intended FLP-mediated excision step was not achieved in the mosquito germline, analysis of the component strains revealed efficient germline activity of the rare-cutting homing endonuclease I-SceI and a striking bias towards homology-based repair of I-SceI-induced double-strand breaks. Across reporter and donor configurations, cleavage outcomes were dominated by single-strand annealing, microhomology-mediated repair, synthesis-dependent strand annealing and gene conversion-like events, with comparatively limited evidence for classical non-homologous end joining. In reciprocal crosses designed to distinguish gene conversion from gamete loss, I-SceI cleavage also produced inheritance distortion consistent with both conversion of the cleaved allele and reduced recovery of gametes carrying extensively damaged donor alleles. These findings indicate that the An. gambiae germline can strongly favour homology-dependent repair following homing endonuclease cleavage and that cleavage can also generate meiotic drive-like distortion through selective loss of damaged gametes.
The results have direct relevance for the design and interpretation of homing endonuclease and CRISPR-based gene drives in malaria mosquitoes, where the balance between homology-directed repair, end joining and gamete viability will determine drive efficiency, resistance formation and transmission bias. Author summary: Gene drives depend on a simple but demanding principle: a nuclease cuts one chromosome, and the cell repairs the break using the homologous chromosome as a template, copying the drive element in the process. Before CRISPR, this type of system was explored using naturally occurring homing endonucleases such as I-SceI. We attempted to develop a gene targeting system in Anopheles gambiae based on the Rong and Golic strategy, in which FLP recombinase would excise a donor molecule and I-SceI would linearise it to stimulate recombination. The full knockout technology did not work because FLP-mediated excision was not detected in the mosquito germline. However, the component tests revealed something more broadly important: I-SceI-induced breaks were repaired predominantly through homology-based pathways rather than simple end joining. We also observed inheritance distortion consistent with both gene conversion and loss of damaged gametes. These results help explain why homing-based systems can work in mosquitoes, while also highlighting why repair pathway choice and gamete viability need to be measured directly in any new drive configuration.
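The two sources of inheritance distortion described here, conversion of the cleaved allele and loss of damaged gametes, can be combined in a simple toy model. The sketch below is an illustrative simplification, not the authors' analysis: the function name and the three parameters (cleavage probability, fraction of cleaved chromosomes repaired by homing, fraction of cleaved chromosomes lost as non-viable gametes) are assumptions introduced for this example. It considers a parent heterozygous for the drive allele and returns the share of viable gametes that carry the drive.

```python
def drive_transmission(cleavage, homing, gamete_loss=0.0):
    """Share of viable gametes carrying the drive allele from a heterozygous parent.

    cleavage:    probability the target chromosome is cut (0..1)
    homing:      fraction of cut chromosomes converted to the drive allele (0..1)
    gamete_loss: fraction of cut chromosomes lost as non-viable gametes (0..1)
    """
    for name, value in (("cleavage", cleavage), ("homing", homing),
                        ("gamete_loss", gamete_loss)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    if homing + gamete_loss > 1.0:
        raise ValueError("homing + gamete_loss cannot exceed 1")
    # Half of the gametes inherit the drive chromosome directly; converted
    # target chromosomes add to that half.
    drive = 0.5 + 0.5 * cleavage * homing
    # Lost gametes shrink the pool of viable gametes.
    viable = 1.0 - 0.5 * cleavage * gamete_loss
    return drive / viable


# Hypothetical parameter values for illustration only.
print(drive_transmission(cleavage=0.9, homing=0.8))
print(drive_transmission(cleavage=0.9, homing=0.0, gamete_loss=0.5))
```

In this model, transmission above 50% can arise from homing alone, from gamete loss alone, or from both, which is why crosses that separate the two effects are needed to interpret an observed distortion.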